Genome-Wide Comparative Analysis of the Fasciclin-like Arabinogalactan Proteins (FLAs) in Salicaceae and Identification of Secondary Tissue Development-Related Genes

Fasciclin-like arabinogalactan proteins (FLAs) are a subclass of arabinogalactan proteins (AGPs) containing both AGP-like glycosylated domains and fasciclin (FAS) domains, and they are involved in plant growth and development and in cell wall synthesis. However, these proteins have not been identified or analyzed in willow (Salix), the sister genus of Populus. In this study, we performed a genome-wide analysis of the FLA gene family of Salix suchowensis and compared it with the FLA gene family of Populus deltoides. The results showed the presence of 40 and 46 FLA genes in P. deltoides and S. suchowensis, distributed on 17 and 16 chromosomes, respectively. Four sets of tandemly duplicated genes were found in willow, while poplar had none. Twelve and thirteen segmentally duplicated gene pairs were identified in poplar and willow, respectively. The multispecies phylogenetic tree showed that the FLA gene family could be divided into four groups (I–IV), with Group I showing significant expansion in woody plants. A gene expression analysis showed that PdeFLA19/27 in Group I of poplar were highly expressed specifically during the secondary growth period of the stem and the rapid elongation of seed hairs. Among the Group I genes of S. suchowensis, SsuFLA25/26/28 were also highly expressed during the secondary growth period, whereas increased expression of SsuFLA35 was associated with seed hair tissue. These results provide important clues about the differences in the FLA gene family during the evolution of herbaceous and woody plants, and suggest that the FLA gene family may play an essential role in regulating the secondary growth of woody plants. They also provide a reference for further studies on the regulation of secondary growth and seed hair development by FLA genes in poplar and willow.


Genome-Wide Identification and Sequence Analysis of Populus and Salix FLA genes
Based on the sequence structural characteristics of the FLA gene family, 40 and 46 FLA gene family members were identified in the genomes of P. deltoides and S. suchowensis, respectively. They were named PdeFLA1-PdeFLA40 and SsuFLA1-SsuFLA46 according to their positions on the chromosomes. All the genes contained the conserved FLA structural domain (Supplementary Tables S1 and S2). In P. deltoides, 29 proteins were predicted by the BIG-PI online program to have C-terminal glycophosphatidylinositol (GPI) anchors, and 34 were predicted by the SignalP online program to have N-terminal signal peptides (Supplementary Table S3). The proteins contained between 176 and 466 amino acids, with isoelectric points in the range of 4.50-10.77 and molecular weights between 18,211.68 and 51,122.1 Da. In S. suchowensis, 32 proteins were predicted by the BIG-PI online program to have C-terminal GPI anchors, and 41 were predicted by the SignalP online program to have N-terminal signal peptides (Supplementary Table S4). The amino acid lengths ranged from 206 to 468, the isoelectric points ranged from 4.33 to 10.81, and the molecular weights were between 21,708.68 and 51,646.93 Da. The gene names, gene IDs, chromosomal locations, amino acid lengths, isoelectric points (pI), and molecular weights (Mw) are listed in Table 1. This study comprehensively analyzed the FLA gene family in P. deltoides and S. suchowensis, identifying 40 and 46 genes, respectively. These gene numbers were 1.9-fold and 2.2-fold higher than the number of FLA genes in Arabidopsis [9]. Previous research has shown that early flowering species have a faster replacement rate than species with long generations [27,28]. This suggests that these two species have more FLA genes than Arabidopsis either because the poplar and willow genomes are larger or because their replacement rate is much lower.

Chromosomal Distribution and Identification of Gene Duplication in Populus and Salix FLA Genes
To determine the distribution of FLA gene family members on the chromosomes of the P. deltoides and S. suchowensis genomes, and to evaluate the duplication relationships among the genes, we mapped the 40 poplar and 46 willow FLA genes onto the 19 chromosomes of their respective genomes using nucleotide sequence alignment. Four poplar genes and two willow genes failed to map to chromosomes (Figure 1) and instead mapped to unanchored scaffolds, probably because the genome assemblies are incomplete.
We found that the distribution of FLA genes on the chromosomes in the two studied genomes was uneven. Poplar chromosomes I, VI, XIII, and XIX each carried the maximum of four FLA genes, and no FLA genes were found on chromosomes III and VII (Figure 1A). In willow, relatively more FLA genes were distributed on chromosomes VI, XIII, and XIX, with 7, 8, and 6 members, respectively (Figure 1B), and no FLA genes were found on chromosomes II, III, and V. The same number of FLA genes were distributed in the same positions on chromosomes I, X, XI, XII, XIV, and XVII of the two genomes, and we speculated that they were orthologous genes. We therefore analyzed the orthologous relationships between P. deltoides and S. suchowensis and identified 36 pairs of orthologous genes (Supplementary Table S5). The synteny analysis showed that 28 pairs of FLA genes were located in the same chromosomal position in P. deltoides and S. suchowensis, five pairs were located in different positions, and the collinearity of three pairs could not be determined because the genes have not yet been anchored to chromosomes (Figure 2A). Our results demonstrated that the distribution of FLA genes is similar overall in both genomes, although there were large differences on certain chromosomes. For example, in contrast to the poplar genome, the willow genome had no FLA genes on chromosome II (Figure 1B), which might be because willow species evolved faster than poplar species, and some genes that were not adapted to the environment were gradually eliminated.
Segmental and tandem duplication are the main drivers of gene family expansion. We found four sets of tandemly duplicated genes (SsuFLA10, SsuFLA11, and SsuFLA12; SsuFLA14 and SsuFLA15; SsuFLA25, SsuFLA26, SsuFLA27, SsuFLA28, and SsuFLA29; and SsuFLA43, SsuFLA44, and SsuFLA45) in S. suchowensis, distributed on chromosomes VI, VII, XIII, and XIX (Figure 1B). No tandemly duplicated genes were found in P. deltoides. Twelve segmental duplication events were identified in P. deltoides, involving 23 genes (Figure 2B and Supplementary Table S6). Thirteen segmental duplication events were identified in S. suchowensis, involving 25 genes (Figure 2C and Supplementary Table S6). All gene pairs, apart from SsuFLA43/45 in S. suchowensis, which showed complete sequence identity, had Ka/Ks values less than 1 (Supplementary Table S6), indicating that the FLA gene family underwent purifying selection.
A comparison of the gene duplication results between poplar and willow showed that the number of segmental duplication events was similar, but the locations of the segmental duplications differed greatly, with only six segmental duplications sharing the same chromosomal location. In addition, tandem duplication of FLA genes was more frequent in willow than in poplar. These results suggest that FLA genes may have undergone six segmental duplications in the common ancestor of poplar and willow. After divergence of the two species, the average rate of gene replacement was significantly higher in the willow genome than in the poplar genome [27,28], and the rate of evolution was also higher [26,29]. This rapid expansion allowed the preservation of these newly generated species. This may also be why willow has more FLA genes than poplar.
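The way Ka/Ks values are read in this section can be illustrated with a minimal sketch. It assumes the conventional interpretation (ratio below 1: purifying selection; above 1: positive selection; equal to 1: neutral) and treats pairs with Ks = 0, such as fully identical sequences, as undefined. The function names and the numeric values are hypothetical and are not taken from Supplementary Table S6.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def classify_selection(ka, ks):
    """Label a duplicated gene pair by the conventional reading of Ka/Ks."""
    ratio = ka_ks_ratio(ka, ks)
    if ratio is None:
        # Identical coding sequences give Ka = Ks = 0, so no call is made.
        return "undefined"
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# Hypothetical (Ka, Ks) values for illustration only.
example_pairs = {
    "pair_A": (0.05, 0.25),  # ratio 0.2
    "pair_B": (0.0, 0.0),    # identical sequences
}
calls = {name: classify_selection(ka, ks) for name, (ka, ks) in example_pairs.items()}
```

Handling the Ks = 0 case explicitly avoids a division error and mirrors why a pair with complete sequence identity is reported separately from the pairs with Ka/Ks below 1.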

Phylogeny, Conserved Gene Structures, and Protein Motif Analysis of Populus and Salix FLA Genes
Poplar and willow evolved from the same ancestral species, and after their divergence both retained orthologous genes inherited from that ancestor. At the same time, gene duplication occurred in both genomes, generating new paralogous genes. To compare the sequences and structures of these homologous genes, we constructed a phylogenetic tree based on the full-length sequences of the FLA genes. We combined gene structure and conserved motif prediction to present a comprehensive picture of the homology of FLA genes between P. deltoides and S. suchowensis. The analysis showed that 22 pairs of P. deltoides and S. suchowensis FLA genes were located at the terminal branches of the phylogenetic tree, including PdeFLA2 and SsuFLA46, PdeFLA35 and SsuFLA27, and PdeFLA18 and SsuFLA15 (Figure 3A). In addition, multiple willow FLA genes were found to cluster with one poplar FLA gene on a single branch (Figure 3A). For example, SsuFLA43, SsuFLA44, and SsuFLA45 clustered with PdeFLA36 on the same branch, and this, combined with their chromosomal distributions, confirmed that SsuFLA43, SsuFLA44, and SsuFLA45 were duplicates generated by tandem duplication during willow evolution.

FLA genes are generally reported to have lost introns during plant evolution. Nevertheless, in our analysis, we found that 34.88% (30/86) of P. deltoides and S. suchowensis FLA genes contained one intron, 11.63% (10/86) had two introns, and 2.33% (2/86) had more than two introns, with only 51.16% (44/86) having no introns (Figure 3B). Overall, the intron structure varied widely among the FLA gene family members. However, combined with the phylogenetic tree analysis, our results indicated that genes on the same evolutionary branches of the P. deltoides and S. suchowensis FLA gene families had similar intron patterns. Furthermore, while most of the closest genes showed similar structures, a small subset showed different intron-exon arrangements. For example, SsuFLA6 had no introns, while its closest homolog, PdeFLA7, contained three, even though the branch grouping the two genes had 99% bootstrap support.
We further predicted the conserved motifs in the P. deltoides and S. suchowensis FLA proteins using the MEME online program to determine the specific regions of the FLA proteins. A comparison of the 20 conserved motifs found in P. deltoides and S. suchowensis (Figure 3C) revealed that the major motifs of the two species were similar, and the composition of the motifs was not significantly different. The lengths of the 20 conserved motifs ranged from 11 to 50 amino acids (Supplementary Table S7), and the number of motifs in each FLA protein ranged from 1 to 11. Most FLA proteins had motifs 1, 2, 3, 4, 8, and 9 (Figure 3C). Motifs 1, 2, 3, and 4 were located within the fasciclin domain, and motifs 8 and 9 were observed in the AGP-like glycosylation regions. However, while motifs 1, 2, 3, and 4 followed the fasciclin domain in the sequence and were particularly conserved in most clades, some motifs were selectively distributed among specific clades of the phylogenetic tree. For example, PdeFLA32 and SsuFLA40 contained only motif 4, and SsuFLA41, SsuFLA30, and PdeFLA33 contained only motif 18. These results suggest that the conserved motifs play key roles in functions that are either shared across the family or specific to particular clades.

Phylogenetic Analysis and Functional Prediction of the FLA Gene Family
We investigated the phylogenetic relationships of the FLA proteins in different plants and constructed a phylogenetic tree including five woody plants (P. deltoides, S. suchowensis, Betula platyphylla, Quercus rubra L., and Cinnamomum kanehirae) and five herbaceous plants (Setaria italica, Zea mays L., Cucumis sativus L., Oryza sativa L., and Arabidopsis thaliana). In all, 277 FLA gene members were classified using the neighbor-joining (NJ) method. The phylogenetic tree showed that the FLA gene family was divided into four evolutionary branches, which were named Group I, II, III, and IV, respectively (Figure 4).
The statistics of the gene members in each group showed that Group I was the largest group, containing 123 genes, while Group IV was the smallest, with 26 genes. The number of genes in each group was counted for each species used in the construction of the evolutionary tree, and it was found that the woody plants had more Group I genes than the herbaceous plants. Group I genes made up an average of 50.49% of the FLA genes in woody plants, while the average proportion in herbaceous plants was 33.16% (Supplementary Table S8). Additionally, Group I FLA genes made up an average of 0.04584% of all genes in the genomes of woody plants, but only 0.02118% in herbaceous plants (Supplementary Table S8). The differences in these proportions between woody and herbaceous plants were found to be highly significant (Supplementary Figure S1). This suggests that among the plants selected for evolutionary analysis, FLA genes in the Group I branch of woody plants underwent a major expansion. It also implies that FLA genes play important roles in secondary growth in woody plants.
Because homologous genes tend to be functionally conserved, genes with similar or identical functions were classified into the same groups, providing a reference basis for predicting the functions of related genes in the gene family. To predict the biological functions of the poplar and willow FLA genes, we referred to the functions of FLA genes that have been verified in Arabidopsis. The gene members in Group I belonged to the same subgroup as the AtFLA11 and AtFLA12 genes, which are associated with xylem development. Related studies have found a correlation between the abundance of AtFLA11 and AtFLA12 transcripts, which encode proteins containing a single FAS domain, and the onset of secondary cell wall cellulose synthase expression in Arabidopsis stems [30,31]. In addition, AtFLA11 mutants showed a mildly collapsed vessel phenotype and reduced stem cellulose content [30-32]. These analyses indicated that the FLA members in Group I were associated with secondary wall and cellulose synthesis in the stem. Group II proteins fall into the same subgroup as AtFLA1. A T-DNA insertion mutant of the AtFLA1 gene has been described; although the FLA1-1 mutants had no distinct phenotype under standard growth conditions, in vitro shoot induction assays indicated that the gene may play a role in the ability of the callus to regenerate [33]. Therefore, we speculate that the Group II proteins might be involved in the development of buds. AtFLA16 belongs to Group III, and studies on AtFLA16 mutants showed that the deletion of the gene resulted in reduced stem lengths and altered biomechanical properties [22]. We thus speculate that Group III members are responsible for maintaining and regulating stem strength and support. Group IV includes AtFLA3, and overexpression of the gene led to defective elongation of stamen filaments and reduced female fertility, resulting in siliques with low seed set, suggesting that AtFLA3 was involved in microspore development and might affect the pollen intine by participating in cellulose deposition [34]. Therefore, we speculate that the members of Group IV might have functions involved in the development of microspores.
A combination of the phylogenetic analysis and functional predictions of FLA genes strongly suggested that genes in Group I were likely to be associated with cell wall synthesis and secondary growth in plants. In particular, two sets of tandem repeat genes from willow (SsuFLA10, SsuFLA11, and SsuFLA12 and SsuFLA43, SsuFLA44, and SsuFLA45) were included in Group I (Figure 4). In addition, it has been shown that a subset of FLAs containing a single FAS domain contributed to plant stem strength by affecting cellulose deposition. It has also been suggested that this influences the stem elastic modulus by affecting the integrity of the cell wall matrix [20]. We further found that FLA genes in Group I contained only one FAS domain and had undergone duplication. Thus, these analyses supported our hypothesis that Group I genes were involved in plant cellulose and cell wall formation, were significantly amplified in woody plants, and were involved in wood formation. These results suggested that these genes were good candidates for further functional verification and phylogenetic analysis.

Identification of FLA Candidate Genes Related to Stem Development in Populus and Salix
Wood production relies on the activity of the vascular cambium. A primary vascular system consists of a discrete set of vascular bundles that contain fascicular cambium, primary phloem, and primary xylem. The primary vascular bundle originates from procambium cells on the periphery of the rib region of the shoot apical meristem (SAM). In perennial woody plants, secondary growth arises from the meristematic activity of the vascular cambium: the fascicular cambium located at the center of the primary vascular bundle extends into the interfascicular region, tangentially generating an interfascicular cambium and thereby forming a continuous vascular cambium ring [35,36]. The meristematic activity of the vascular cambium then leads to the continuous production of cylindrical secondary vascular tissue (wood). Therefore, we selected tissues with different degrees of lignification in P. deltoides and S. suchowensis stems and analyzed the expression patterns of Group I members of the FLA gene family. Since some genes had high sequence similarity, it was impossible to design specific primers to distinguish them, so universal primers were designed for these genes to analyze their expression patterns. For example, the sequence similarity was 91.42% between PdeFLA19 and PdeFLA27; 96.09% between PdeFLA35 and PdeFLA38; 97.99% among SsuFLA43, SsuFLA44, and SsuFLA45; and 93.38% among SsuFLA25, SsuFLA26, and SsuFLA28.
In P. deltoides, five parts of the stem, internodes 1-2, 3-4, 5-7, 8-9, and 10-11, were selected from the stem tip downwards and named P_S1, P_S2, P_S3, P_S4, and P_S5, respectively (Figure 5A). Expression patterns were analyzed for some of the FLA gene family members in Group I. The RT-qPCR analysis (Figure 5C) showed that PdeFLA2/15/19/20/22/27/35/36/38 were highly expressed in the elongation region of the stem (P_S3), whereas their expression was significantly lower in the stem tip (P_S1) and in the secondary growth region of the stem (P_S4). On the other hand, PdeFLA7/14/37 were expressed mainly in young stems (P_S2) and in the elongation region of the stem (P_S3). In particular, PdeFLA19/27 had the highest relative expression. These results suggested that PdeFLA19/27 were associated with secondary thickening growth and could thus serve as major candidate genes for later studies on poplar wood growth and development.
We selected tissues from S. suchowensis in the same way as for poplar. They were named S_S1 to S_S5 and were analyzed by RT-qPCR to determine the expression patterns of the Group I FLA family members. The stem cross-sections showed that willow lignified faster than poplar in the same nodal tissues, especially in S4 and S5, where the secondary vascular tissue formed a thicker structure (Figure 5B). The RT-qPCR analysis showed that a total of 17 genes (Figure 5D), all members of the Group I branch in willow, were positively correlated with stem node lignification and had very low expression in shoot leaves and stem tips (S_S1). The expression of SsuFLA5/6/18/23/27 increased gradually from S_S1 to S_S5, while the expression levels of SsuFLA21/22/25/26/28/29/34/35/43/44/45/46 were highest in S_S4 tissue but tended to decrease in S_S5 tissue (Figure 5D). Therefore, it was inferred that rapid lignification of willow stems occurred at internodes 8-9 (S_S4). The Group I FLA genes thus appear to be highly functionally redundant in the regulation of stem lignification. Of these, the three tandem repeats SsuFLA25/26/28 were expressed at high levels and appear to play a dominant role in willow stem lignification. These results suggested that these genes could be important candidates for further investigation of wood development and formation in willow.

Identification of FLA Candidate Genes Related to Seed Hair Development in Populus and Salix
Seed hairs are composed of flocculent fibers produced by poplar and willow trees to assist in seed dispersal after the trees reach sexual maturity. They are formed by the outgrowth of placental epidermal cells, and their cell walls are largely composed of cellulose, hemicellulose, and pectin. To investigate the regulatory roles of FLA genes in these cell walls, we collected tissues from different periods of poplar seed hair development to analyze the expression of the Group I genes. The results showed that poplar seed hairs produced obvious flocculent fibers that elongated rapidly 3 days after pollination (DAP). Furthermore, the seed hairs were long enough to completely wrap the ovules by 5 DAP, indicating that the 3-5 DAP period was associated with the rapid elongation of seed hairs (Figure 6A). The RT-qPCR analysis showed that the PdeFLA2/18/19/27 genes were highly expressed specifically at 3 DAP, which was consistent with the rapid elongation of seed hairs. The highest relative expression was seen in the PdeFLA19/27 genes (Figure 6C). We therefore speculate that these genes have important regulatory roles in the development of cell walls during the rapid elongation of poplar seed hairs.
Figure caption (fragment): The X-axis represents different tissues and the Y-axis represents relative expression. Values shown are means ± SD of three biological replicates (** p < 0.01, two-sided Student's t-test).
In S. suchowensis, we collected pollinated and unpollinated florets from the same catkin to analyze the expression of Group I FLA genes. The results showed that pollinated florets had developed numerous seed hairs by 7 DAP, while unpollinated florets did not produce seed hairs (Figure 6B). Furthermore, the RT-qPCR analysis showed that 15 of the 17 Group I genes, all except SsuFLA22/34, were expressed at significantly higher levels in pollinated florets than in unpollinated tissue (Figure 6D). Of these, the SsuFLA35 gene showed the highest relative expression in pollinated florets, and we speculate that it may be involved in regulating the development of willow seed hair tissue.
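How a relative expression value with its mean and SD might be derived from RT-qPCR data can be sketched as follows. This text does not state the quantification method, so the sketch assumes the widely used 2^−ΔΔCt calculation with one reference gene and one calibrator sample. All Ct values, the reference gene, and the function name are hypothetical.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^-ddCt calculation (an assumed method).

    ct_target / ct_reference: Ct of the gene of interest and of the
    reference gene in the sample; the *_cal values are the same two
    measurements in the calibrator sample.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_ct_sample - delta_ct_calibrator)


# Hypothetical Ct values for three biological replicates of one tissue:
# (target gene Ct, reference gene Ct).
sample_cts = [(22.0, 18.0), (22.5, 18.5), (21.0, 18.0)]
# Hypothetical calibrator sample (for example, the earliest tissue stage).
calibrator = (25.0, 18.0)

values = [relative_expression(t, r, *calibrator) for t, r in sample_cts]
avg = mean(values)   # mean of the three replicates
sd = stdev(values)   # sample standard deviation
```

Summarizing the three replicates as mean ± SD matches the way the expression values are reported in the figure captions; a significance test between tissues would be a separate step.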

Genome-Wide Identification and Sequence Analysis of Populus and Salix FLA Genes
The genome sequences of P. deltoides (PRJNA598948) and S. suchowensis (ASM1755242v1) were downloaded from the NCBI website (https://www.ncbi.nlm.nih.gov/ accessed on 8 March 2021). The hidden Markov model (HMM) profile of the fasciclin domain (PF02469) was downloaded from the Pfam database (http://pfam.xfam.org/ accessed on 8 March 2021). HMMER 3.0 software was used to screen for proteins containing this domain in the genome sequences of P. deltoides and S. suchowensis [37]. In addition, a local database was constructed from the whole genome sequences of P. deltoides and S. suchowensis, and a BLASTP search was performed using an FLA protein sequence of A. thaliana. The candidate protein sequences identified by the two methods were combined, and duplicates, incompletely sequenced proteins, and proteins without complete coding frames were removed. Candidate proteins were then analyzed using the Pfam (http://pfam.xfam.org/ accessed on 8 March 2021, PF02469) [38] and SMART (http://smart.embl.de/ accessed on 8 March 2021) [39] databases, and protein sequences with incomplete domains were removed.
Prediction of the N-terminal signal peptides was performed on the SignalP 5.0 website (https://services.healthtech.dtu.dk/service.php?SignalP-5.0 accessed on 8 March 2021) [40]. Prediction of the attachment sites of the C-terminal GPI anchor was performed in the BIG-PI database (http://mendel.imp.ac.at/sat/gpi/gpi_server.html accessed on 8 March 2021) [41]. The fasciclin domain, N-terminal signal peptide, and C-terminal GPI anchor attachment site were removed for further screening for AGP-like glycosylation sites. Finally, the sequences containing the fasciclin domain and AGP-like glycosylation regions were considered the P. deltoides and S. suchowensis FLA gene family sequences. ExPASy (https://web.expasy.org/compute_pi accessed on 8 March 2021) was used for the prediction of the pI (isoelectric point) and molecular weight (Mw) [42].
The nucleotide and protein sequences of the identified FLA gene family members were aligned with the whole genome sequences of P. deltoides and S. suchowensis using BLAST, and the corresponding genome sequences, chromosomal positions, and exon/intron distribution patterns of each member were obtained. In addition, the genome sequences of six other species were downloaded from different websites,  [3] were also downloaded (Supplementary Sequence S1).
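The candidate-merging step described in this subsection (union of the HMM and BLASTP hits, followed by removal of duplicates, incomplete sequences, and proteins with incomplete domains) can be sketched as below. The completeness check (a starting methionine and no internal stop codon) is an assumption made for illustration, as are the function name, the gene IDs, and the toy sequences; in practice the domain check would come from the Pfam and SMART results.

```python
def merge_candidates(hmm_hits, blast_hits, proteins, has_complete_domain):
    """Combine two candidate ID lists and drop unusable sequences.

    hmm_hits, blast_hits: iterables of protein IDs from the two searches.
    proteins: dict mapping protein ID to amino acid sequence.
    has_complete_domain: callable returning True if the ID passed the
        domain check (here a stand-in for the Pfam/SMART screening).
    """
    # The set union removes IDs found by both methods.
    candidates = sorted(set(hmm_hits) | set(blast_hits))
    kept = []
    for pid in candidates:
        seq = proteins.get(pid, "")
        # Assumed completeness rule: starts with Met, no internal stop ("*").
        if not seq.startswith("M") or "*" in seq.rstrip("*"):
            continue
        if not has_complete_domain(pid):
            continue
        kept.append(pid)
    return kept


# Hypothetical toy data.
proteins = {
    "g1": "MKTAL",    # complete
    "g2": "KTALM",    # no starting Met
    "g3": "MKT*AL",   # internal stop codon
    "g4": "MAAAL*",   # complete, terminal stop retained
    "g5": "MGGGL",    # complete frame but incomplete domain
}
complete_domain_ids = {"g1", "g4"}
fla_candidates = merge_candidates(
    hmm_hits=["g1", "g2", "g4", "g5"],
    blast_hits=["g1", "g3", "g4"],
    proteins=proteins,
    has_complete_domain=lambda pid: pid in complete_domain_ids,
)
```

Passing the domain check in as a callable keeps the merging logic independent of whichever database supplies the domain annotation.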

Chromosomal Distribution of FLA Genes
Using the annotation files of the P. deltoides and S. suchowensis genomes, the chromosomal location information of the FLA gene family was extracted, and the length of each chromosome was obtained. MapChart software was used to map the chromosomal locations and relative distances of the FLA genes. Tandem duplications of FLA genes were identified based on two criteria: (1) the genes were located within 100 kb of each other on a chromosome and (2) the aligned sequences of the two genes shared >70% similarity [43,44]. Synteny analysis between species and gene duplication analysis within species were performed using MCScanX software. Visual mapping was completed using TBtools, and KaKs_Calculator 2.0 was used to calculate the synonymous (Ks) and non-synonymous (Ka) substitution rates of FLA gene pairs.
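The two tandem duplication criteria can be expressed as a short filter. The sketch below assumes that the 100 kb window is measured between gene start coordinates on the same chromosome and that pairwise similarity percentages have already been computed by an alignment step. The function name, gene IDs, coordinates, and similarity values are hypothetical.

```python
def find_tandem_pairs(genes, similarity, max_distance=100_000, min_similarity=70.0):
    """Return gene pairs meeting both tandem duplication criteria.

    genes: list of (gene_id, chromosome, start) tuples.
    similarity: dict mapping frozenset({id_a, id_b}) to percent similarity.
    """
    pairs = []
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    for i, (id_a, chrom_a, start_a) in enumerate(ordered):
        for id_b, chrom_b, start_b in ordered[i + 1:]:
            # Genes are sorted, so once the chromosome changes or the
            # window is exceeded, no later gene can qualify.
            if chrom_b != chrom_a or start_b - start_a > max_distance:
                break
            if similarity.get(frozenset((id_a, id_b)), 0.0) > min_similarity:
                pairs.append((id_a, id_b))
    return pairs


# Hypothetical toy data.
genes = [
    ("geneA", "chr1", 10_000),
    ("geneB", "chr1", 60_000),
    ("geneC", "chr1", 400_000),
    ("geneD", "chr2", 20_000),
    ("geneE", "chr2", 50_000),
]
similarity = {
    frozenset(("geneA", "geneB")): 92.5,  # close and similar
    frozenset(("geneD", "geneE")): 55.0,  # close but too divergent
}
tandem_pairs = find_tandem_pairs(genes, similarity)
```

Pairs that share a member can afterwards be merged into larger tandem sets, which is how a run of three or more adjacent duplicates would be reported.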

Phylogenetic Analysis and Functional Prediction of FLA Genes
Multiple sequence alignments of the full-length FLA protein sequences of P. deltoides and S. suchowensis were performed using ClustalX 2.1 with the default parameters [45]. Neighbor-joining (NJ) phylogenetic trees were constructed using MEGA 11 with the parameters set to Poisson correction, pairwise deletion, and bootstrap analysis with 1000 replicates [46]. All FLA protein sequences from P. deltoides, S. suchowensis, C. avellana, Q. rubra L., C. kanehirae, S. italica, Z. mays L., C. sativus L., O. sativa L., and A. thaliana were aligned and used to construct a multispecies phylogenetic tree with MEGA 11. The phylogenetic tree was visualized using the online program iTOL (https://itol.embl.de/login.cgi accessed on 11 August 2021), and a folded phylogenetic tree consisting of 10 species was constructed [47].
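The distance settings named above can be sketched for a single pair of aligned sequences. The Poisson-corrected distance is d = −ln(1 − p), where p is the proportion of compared sites that differ. Pairwise deletion means a site is skipped only for the pair in which one of the sequences has a gap at that position. The code below is an independent illustration of these two definitions, not MEGA's implementation. The function name and the set of characters treated as gaps are assumptions.

```python
import math

# Characters treated as gaps or missing data in this sketch (an assumption).
GAP_CHARS = {"-", "?"}


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance between two aligned protein sequences,
    using pairwise deletion of gapped sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    different = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in GAP_CHARS or y in GAP_CHARS:
            continue  # pairwise deletion: skip this site for this pair only
        compared += 1
        if x != y:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    p = different / compared  # proportion of differing sites
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)
```

Applying this function to every pair of sequences in an alignment gives the distance matrix on which a neighbor-joining tree is built. Bootstrap support is obtained by resampling alignment columns with replacement and repeating the procedure.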

Analysis of Gene Structure and Protein Motif Identification
The intron and exon positions of the FLA genes were extracted from the generic feature format (GFF) files released for the whole genomes of P. deltoides and S. suchowensis, and the number and length of introns and exons were counted. The exon and intron structures were displayed using the online program Gene Structure Display Server (http://gsds.gao-lab.org/index.php accessed on 8 March 2021) [48]. The MEME online program (https://meme-suite.org/meme/tools/meme accessed on 8 March 2021) was used to identify conserved motifs in the FLA proteins. The applied criteria were: (1) each FLA protein sequence contained at least 1 and at most 20 motifs; (2) each motif had a minimum width of 2 and a maximum width of 250 amino acids; and (3) default values were used for all other parameters. The IDs of the FLA members were kept the same as previously named.
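Counting exons and introns from a GFF file can be sketched as follows. The sketch assumes GFF3-style records with nine tab-separated columns, `exon` features, and a `Parent=` attribute that names the transcript. The function names are hypothetical, and the annotation files used in this study may use different conventions.

```python
from collections import defaultdict


def parse_attributes(field: str) -> dict:
    """Split a GFF3 attribute column ("key=value;key=value") into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def exon_intron_summary(gff_lines):
    """Return per-transcript exon and intron counts and lengths."""
    exons = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        start, end = int(cols[3]), int(cols[4])  # 1-based, inclusive
        for parent in parse_attributes(cols[8]).get("Parent", "").split(","):
            if parent:
                exons[parent].append((start, end))

    summary = {}
    for transcript, spans in exons.items():
        spans.sort()
        exon_lengths = [end - start + 1 for start, end in spans]
        # An intron is the gap between two consecutive exons of a transcript.
        intron_lengths = [nxt[0] - cur[1] - 1 for cur, nxt in zip(spans, spans[1:])]
        summary[transcript] = {
            "exon_count": len(exon_lengths),
            "exon_lengths": exon_lengths,
            "intron_count": len(intron_lengths),
            "intron_lengths": intron_lengths,
        }
    return summary
```

Because the exons are sorted by coordinate, the lengths are reported in genomic order on both strands. Reversing the lists for minus-strand transcripts would give transcript order instead.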

Preparation of Plant Materials
Stems with different degrees of lignification were collected from six-month-old P. deltoides and S. suchowensis plants grown at the Plant Growth Laboratory of Nanjing Forestry University (Nanjing, China). Five parts of the stem, internodes 1-2, 3-4, 5-7, 8-9, and 10-11, were selected from the stem tip downwards and named S1, S2, S3, S4, and S5. Tissues at different stages of poplar seed hair development were also collected, all from flowering branches of female P. deltoides. Hydroponics was used to promote flowering. Pollination was performed when the stigma was fully expanded, and this time point was termed 0 days after pollination (0 DAP). Inflorescence ovaries were collected at −1 DAP, 0 DAP, 3 DAP, and 5 DAP in sequence. For the collection of willow seed hair tissue, male and female flowering branches of S. suchowensis were cultured hydroponically to promote flowering. After the female inflorescences opened, some of the florets were pollinated. Seven days after pollination, pollinated and unpollinated florets from the same inflorescence were collected separately. Three replicates of each sample were collected, immediately frozen in liquid nitrogen, and stored at −80 °C for RNA extraction.

Primer Design, RNA Extraction, and qRT-PCR
The primers used for analyzing the expression of FLA genes in P. deltoides and S. suchowensis were designed using Primer 5.0. The specificity of the primers was verified by agarose gel electrophoresis, and the primer sequences are shown in Supplementary Table S10. PtUBQ (Potri.015G013600.1) [49] and ACT7 (SapurV1A.0231s0320.1) [50] were used as internal reference genes for P. deltoides and S. suchowensis, respectively. Total RNA was extracted from the P. deltoides and S. suchowensis tissues using the RNA Easy Fast Plant Tissue Kit (TIANGEN, Nanjing, China). One microgram of RNA from each sample was reverse transcribed to cDNA using the One-Step gDNA Removal and cDNA Synthesis SuperMix (TIANGEN, Nanjing, China). Quantitative RT-PCR was performed on a 7500 Fast Real-Time PCR System (Applied Biosystems, USA). The amplification reaction system was as follows: 1 µL each of forward and reverse primers (10 µmol/L), 2 µL cDNA (5-fold dilution), 10 µL PowerUp™ SYBR™ Green Master Mix (Applied Biosystems), and RNase-free water added to a final volume of 20 µL. The amplification procedure was as follows: 95 °C for 15 s, 60 °C for 30 s, and 72 °C for 30 s, for 40 cycles. Finally, the relative expression of each FLA gene was calculated using the 2^(−ΔΔCT) method [51].
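The relative expression calculation can be sketched as below, assuming the standard comparative CT (2^(−ΔΔCT)) formulation. ΔCT is the CT of the target gene minus the CT of the internal reference gene in the same sample, and ΔΔCT is the ΔCT of the sample of interest minus the ΔCT of a calibrator sample. The choice of calibrator, the function name, and the averaging of technical replicates before subtraction are assumptions for illustration and are not stated in the text. The formula also assumes that the target and reference genes amplify at close to 100% efficiency.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene relative to a calibrator sample,
    computed as 2 ** -(delta-delta CT).

    Each argument is a list of replicate CT values.
    """
    # Normalize the target gene to the internal reference gene in each sample.
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    # Compare the normalized value with the calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)
```

For a stem series such as S1 to S5, one reasonable choice is to use the first segment as the calibrator, so that every other segment is reported as a fold change relative to it. As a check on the reaction mixture, the listed components total 1 + 1 + 2 + 10 = 14 µL, so 6 µL of RNase-free water brings each reaction to 20 µL.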

Conclusions
In this study, we performed a systematic analysis of the FLA gene family in P. deltoides and S. suchowensis, including the identification of the FLA genes and analysis of their structure and chromosomal distribution, as well as the identification of conserved domains and the evolution of the gene family in multiple species. The analysis revealed that FLA genes on the Group I branch of the phylogenetic tree were involved in the regulation of secondary growth processes, such as lignification, in poplar and willow, and further showed that these genes had undergone significant expansion in woody plants, suggesting that this group of genes may have been important in the evolution of woody plants from herbaceous plants. Finally, analysis of the expression of the Group I genes in different tissues confirmed that PdeFLA19/27 were highly expressed in poplar, specifically during stem lignification and seed hair development, suggesting an important regulatory role in stem or seed hair development. Furthermore, a set of duplicated genes in willow, SsuFLA25/26/28, were identified as important candidates for stem lignification, and SsuFLA35 may be involved in the regulation of seed hair development. In summary, this study provides a multi-faceted reference for subsequent studies on the functions of FLA genes in poplar and willow.